Compiling Boostexter Rules into a Finite-state Transducer
نویسنده
چکیده
A number of NLP tasks have been effectively modeled as classification tasks using a variety of classification techniques. Most of these tasks have been pursued in isolation with the classifier assuming unambiguous input. In order for these techniques to be more broadly applicable, they need to be extended to apply on weighted packed representations of ambiguous input. One approach for achieving this is to represent the classification model as a weighted finite-state transducer (WFST). In this paper, we present a compilation procedure to convert the rules resulting from an AdaBoost classifier into an WFST. We validate the compilation technique by applying the resulting WFST on a call-routing application.
منابع مشابه
On Finite-State Tonology with Autosegmental Representations
Building finite-state transducers from written autosegmental grammars of tonal languages involves compiling the rules into a notation provided by the finitestate tools. This work tests a simple, human readable approach to compile and debug autosegmental rules using a simple string encoding for autosegmental representations. The proposal is based on brackets that mark the edges of the tone autos...
متن کاملThe OpenGrm open-source finite-state grammar software libraries
In this paper, we present a new collection of open-source software libraries that provides command line binary utilities and library classes and functions for compiling regular expression and context-sensitive rewrite rules into finite-state transducers, and for n-gram language modeling. The OpenGrm libraries use the OpenFst library to provide an efficient encoding of grammars and general algor...
متن کاملAn Efficient Compiler for Weighted Rewrite Rules
Context-dependent rewrite rules are used in many areas of natural language and speech processing. Work in computational phonology has demonstrated that, given certain conditions, such rewrite rules can be represented as finite-state transducers (FSTs). We describe a new algorithm for compiling rewrite rules into FSTs. We show the algorithm to be simpler and more efficient than existing algorith...
متن کاملHFST - Framework for Compiling and Applying Morphologies
HFST–Helsinki Finite-State Technology (hfst.sf.net) is a framework for compiling and applying linguistic descriptions with finite-state methods. HFST currently connects some of the most important finite-state tools for creating morphologies and spellers into one open-source platform and supports extending and improving the descriptions with weights to accommodate the modeling of statistical inf...
متن کاملA Method for Compiling Two-Level Rules with Multiple Contexts
A novel method is presented for compiling two-level rules which have multiple context parts. The same method can also be applied to the resolution of so-called right-arrow rule conflicts. The method makes use of the fact that one can efficiently compose sets of twolevel rules with a lexicon transducer. By introducing variant characters and using simple pre-processing of multi-context rules, all...
متن کامل